Abstract


The outstanding human cognitive capacities are computed in the cerebral cortex, a mammalian-specific brain region 
and the place of massive biological innovation. Long noncoding RNAs have emerged as gene regulatory elements with 
higher evolutionary turnover than mRNAs. The many long noncoding RNAs identified in neural tissues make them 
candidates for molecular sources of cerebral cortex evolution and disease. Here, we characterized the genomic and 
cellular shifts that occurred during the evolution of the long noncoding RNA repertoire expressed in the developing 
cerebral cortex and explored putative roles for these long noncoding RNAs in the evolution of the human brain. Using 
transcriptomics and comparative genomics, we comprehensively annotated the cortical transcriptomes of humans, 
rhesus macaques, mice, and chickens and classified human cortical long noncoding RNAs into evolutionary groups as 
a function of their predicted minimal ages. Long noncoding RNA evolutionary groups showed differences in
expression levels, splicing efficiencies, transposable element contents, genomic distributions, and transcription
factor binding to their promoters. Furthermore, older long noncoding RNAs showed preferential expression in
germinative zones, outer radial glial cells, and cortical inhibitory (GABAergic) neurons. In comparison, younger
long noncoding RNAs showed preferential expression in cortical excitatory (glutamatergic) neurons, were enriched
in primate and human-specific gene co-expression modules, and were dysregulated in neurodevelopmental disorders.
These results suggest different evolutionary routes for older and younger cortical long noncoding RNAs,
highlighting old long noncoding RNAs as a possible source of molecular evolution of conserved developmental
programs; conversely, we propose that the de novo expression of primate- and human-specific young long noncoding
RNAs is a putative source of molecular evolution and dysfunction of cortical excitatory neurons, warranting
further investigation.
Introduction

The cerebral cortex is a primary information-processing
center of the central nervous system that is crucial to
the evolution of higher cognition and is affected by
neurodevelopmental disorders. It comprises billions of
excitatory projection neurons (glutamatergic) and inhibitory
interneurons (GABAergic) assembled in local circuits
intertwined with glial and vascular cells arranged in a
six-layered architecture on the outer surface of the
mammalian brain (Molnar et al. 2019; Libé-Philippot and
Vanderhaeghen 2021). The cerebral cortex evolved from
the dorsal pallium after the divergence of mammals and
sauropsids (reptiles and birds) approximately 300 million
years ago (Mya), and it is endowed with incredible
plasticity, evident in the diverse neocortical sizes and
shapes (Lui et al. 2011; Silbereis John et al. 2016).
Primates present an expanded brain with more total neurons
than most mammalian species. The human cerebral cortex has
further expanded, differentiating us from our closest
living relatives. These expansions and the diversification
of neuronal cell types are likely responsible for the
computational capacities and unparalleled cognition of
humans (Berg et al. 2021).

At the cellular level, the human developing cerebral
cortex presents an augmented proliferative capacity of
neural progenitors (radial glial cells [RGCs]), especially
from the outer subventricular zone (outer RGCs [oRGCs]; Lui
et al. 2011), as well as an improvement in the information
processing capability of mature excitatory neurons (Berg
et al. 2021). Understanding the molecular basis of these
differences is critical to unveiling the evolution of human
higher cognition and having a deeper comprehension of
how they are disrupted in diseases (Silbereis John et al.
2016). In this line, a considerable effort has been made in
the past decade to identify those changes, finding that
duplications of protein-coding genes and modifications in
gene regulatory regions have altered the transcriptional
landscape of different cell populations in the developing
cerebral cortex (Libé-Philippot and Vanderhaeghen 2021;
Vanderhaeghen and Polleux 2023). This extensive work
has mainly focused on changes in the expression of
protein-coding genes; expanding this analysis to the
human noncoding transcriptome is crucial to improve our
understanding of the gene regulatory modifications that
have led to the evolution of the human cerebral cortex.

Long noncoding RNAs (lncRNAs) are noncoding genes
transcribed into RNAs longer than 200 nucleotides that
do not translate into functional proteins. This
heterogeneous group of RNAs is transcribed by RNA pol II and
shares molecular features with mRNAs, such as being 5’
capped, spliced, and polyadenylated; despite the molecular
similarities with mRNAs, lncRNAs also present features
that differentiate them, including higher tissue specificity,
distinct chromatin modifications at the promoter region,
cell nucleus enrichment, inefficient splicing, and less
stability compared with mRNAs (Ransohoff et al. 2018; Statello
et al. 2021). Although these features may point to lncRNAs
as mere transcriptional noise, it has been shown that at
least a fraction of lncRNAs and the act of their
transcription have gene regulatory functions (Rinn and Chang
2020). Interestingly, neural tissues express a significant
number of lncRNAs in tetrapods (Necsulea et al. 2014;
Hezroni et al. 2015; Sarropoulos et al. 2019), and several
lncRNAs have been characterized as functional regulatory
RNAs of different stages of the cerebral cortex
development (Aprea and Calegari 2015).

Unlike protein-coding genes that have evolved mainly
by gene duplications, lncRNAs have preferentially evolved
by de novo expression and exonization mediated by
transposable elements (TEs; Kapusta et al. 2013). The de novo
expression and the significant contributions of TEs to
the evolution of lncRNAs explain the reduced constraint
under which lncRNAs evolved compared with
protein-coding genes (Kapusta et al. 2013). In addition, it has
been shown in mammals that lncRNAs are a source of
cellular plasticity due to their capacity to acquire new
functional modalities (Guo et al. 2020), and some lncRNAs
are a source of newly identified peptides (Ruiz-Orera
et al. 2014). Interestingly, the first highly evolving
human-specific region (human accelerated region [HAR]) was
identified inside the lncRNA HAR1F, expressed in the
developing cerebral cortex (Pollard et al. 2006). This faster
evolutionary turnover of lncRNAs compared with
mRNAs, their gene regulatory functions, and elevated
expression in neural tissues make lncRNAs good candidates
for molecular drivers of biological innovations in the
context of cerebral cortex evolution.

Here, we used transcriptomics and comparative genomics
to characterize the evolution of the lncRNA repertoire
of the developing cerebral cortex. Our analyses identified
signatures of different evolutionary routes by which
lncRNAs were born throughout human evolution. They
suggested that old and young lncRNAs are possible sources
of molecular innovation of conserved and divergent
developmental programs in the cerebral cortex, respectively.


Results 


Assemblies of New Comprehensive Transcriptomes
Improve the Annotation of lncRNA Genes in Humans
and Three Other Vertebrate Model Organisms

To properly characterize the evolution of human cortical
lncRNAs, we set out to compare the human lncRNA
repertoire to those of three other vertebrate species (rhesus
macaque, mouse, and chicken) that help to recapitulate
the evolutionary history of the human cerebral cortex.
The vertebrate nervous system contains more specific
lncRNAs than most other body tissues (Necsulea et al.
2014; Hezroni et al. 2015; Sarropoulos et al. 2019).
However, the lack of extensive RNA-seq libraries from
the prenatal cerebral cortex has hampered the
reconstruction of more representative lncRNA gene models for this
brain region. lncRNAs are more tissue specific and
expressed at lower levels than mRNAs, and many cell types
(particularly those that are rare or found in early
embryonic stages) have not yet been thoroughly interrogated by
RNA-seq (Ulitsky 2016). Consequently, to avoid
misidentifying homologous lncRNAs due to differences in the
completeness of the human, rhesus macaque, mouse, and
chicken transcriptome annotations, we generated and
annotated new comprehensive catalogs of lncRNAs for all
four species.
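
To illustrate why comparable annotations across the four species matter, the sketch below shows one simple way a minimal age could be derived from a four-species comparison: the group is set by the most distant species in which a homolog is detected. This is an illustration under stated assumptions, not the procedure used in this study; the group labels, the function name, and the input format (a set of species names per human lncRNA) are all hypothetical.

```python
from typing import Dict, Set

# Non-human species ordered from the closest to the most distant
# relative of humans. The labels are illustrative assumptions.
SPECIES_BY_DISTANCE = ["rhesus_macaque", "mouse", "chicken"]


def minimal_age_group(homolog_species: Set[str]) -> str:
    """Return a hypothetical age-group label for one human lncRNA.

    The label is set by the most distant species in which a homolog
    was detected. A missing homolog in a closer species does not make
    the gene younger, because it may reflect gene loss or an
    incomplete annotation rather than true absence.
    """
    unknown = homolog_species - set(SPECIES_BY_DISTANCE)
    if unknown:
        raise ValueError(f"Unexpected species: {sorted(unknown)}")

    group = "human_only"
    # Iterating from closest to most distant, the last match wins.
    for species in SPECIES_BY_DISTANCE:
        if species in homolog_species:
            group = f"shared_with_{species}"
    return group


def classify(homologs: Dict[str, Set[str]]) -> Dict[str, str]:
    """Assign an age-group label to every lncRNA in the input."""
    return {gene: minimal_age_group(sp) for gene, sp in homologs.items()}
```

Under this scheme, a homolog that is present in a species but absent from that species' annotation pushes the human gene into a younger group than it deserves, which is the kind of misclassification that annotating all four species with the same approach is meant to limit.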

We gathered extensive RNA-seq libraries from healthy
tissues encompassing different stages of cerebral cortex
and cerebral pallium development. This collection of
RNA-seq libraries includes recently published data sets
from humans, rhesus macaques, and mice (Li et al. 2018;
Zhu et al. 2018; Sarropoulos et al. 2019; supplementary
fig. S1a to c and table S1, Supplementary Material online);
additionally, we generated new bulk RNA-seq data from
the chicken pallium (supplementary fig. S1d and table S1,
Supplementary Material online). We developed a
genome-assisted approach that integrates efficient and
more accurate bioinformatics tools that we applied to
the collected short-read RNA-seq libraries (see Materials
and Methods and supplementary fig. S2a, Supplementary
Material online). In brief, we mapped bulk RNA-seq short
reads to the genome with STAR (Dobin et al. 2013), using
the latest available genomes as references (hg38,
rheMac10 [Warren et al. 2020], mm39
[https://www.ncbi.nlm.nih.gov/grc/mouse], and galGal6 [Bellott et al.
2017]), most of which were based on long-read sequencing.
We removed RNA-seq libraries with high 3’ bias (see
Materials and Methods and supplementary figs. S2a and
S3a to d, Supplementary Material online), and filtered
libraries were used to assemble new transcriptome models